{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "hw1.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5EJQLxb5_Zh8"
      },
      "source": [
        "# Homework 1\n",
        "In this homework, we will have you train some convolutional neural networks! We will start with a small dataset (CIFAR), and then work our way up to TinyImageNet! This homework originally written by Daniel Gordon with very minor modifications.\n",
        "\n",
        "# Initial Setup\n",
        "\n",
        "This will authenticate Colab to connect to your google drive account. This way you have space to store the datasets and won't have to redownload them every time. You'll also have stable storage to save your best performing networks."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "fLz_ijqeiLtY"
      },
      "source": [
        "from google.colab import drive\n",
        "drive.mount('/gdrive/')\n",
        "!ls /gdrive"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HyuZjUD-764a"
      },
      "source": [
        "Run this code to navigate to the BASE_PATH directory and upload the homework2.tar file inside the BASE_PATH, then extract it.\n",
        "\n",
        "Have a look at [pt_util](https://gist.github.com/pjreddie/e531394d779af2da9201096af0dba78a). We moved some of the useful functions out of the python notebook to make it less cluttered, and added a few more useful functions.\n",
        "\n",
        "I made the BASE_PATH and DATA_PATH variables so you don't have to copy the same strings all over the place if you want to move the locations of the files around."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "v0siRtW65o0f"
      },
      "source": [
        "import os\n",
        "\n",
        "BASE_PATH = '/gdrive/My Drive/colab_files/hw1/'\n",
        "if not os.path.exists(BASE_PATH):\n",
        "    os.makedirs(BASE_PATH)\n",
        "DATA_PATH = BASE_PATH + 'tiny_imagenet/'\n",
        "\n",
        "!pwd\n",
        "!ls\n",
        "os.chdir(BASE_PATH)\n",
        "if not os.path.exists(DATA_PATH + 'train.h5'):\n",
        "    !wget https://courses.cs.washington.edu/courses/cse599g1/19au/files/homework2.tar\n",
        "    !tar -xvf homework2.tar\n",
        "    !rm homework2.tar\n",
        "!pwd\n",
        "!ls\n",
        "os.chdir('/content')"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "POv-u4zC5o0h"
      },
      "source": [
        "# CIFAR"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1mbgx8Ku5o0i"
      },
      "source": [
        "## Part 1: Getting the Dataset\n",
        "Normally, we'd want to download our dataset first. Since PyTorch hosts the CIFAR dataset, we can load it using their helper function later.\n",
        "\n",
        "But, we'll change DATA_PATH to an empty directory to download the dataset to."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "zvCGTzKM5o0j"
      },
      "source": [
        "DATA_PATH = BASE_PATH + 'cifar/'"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sbuC8Bt75o0l"
      },
      "source": [
        "## Part 2: Defining the Network\n",
        "Just like with MNIST last homework we need to define our network architecture. This time we will be using convolutional layers and maxpooling to extract features from our images before we feed those features into our final classifier.\n",
        "\n",
        "Check out the documentation for [nn.Conv2d](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html):\n",
        "\n",
        "    nn.Conv2d(in_channels, out_channels, kernel_size, stride = 1, padding = 0,...)\n",
        "\n",
        "So the first parameter is the number of channels in the input. Second is the number of filters we'll use (AKA number of channels in the output). Third is kernel size. Next is stride and padding which are optional and have default values."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yvK-kdPRav5L"
      },
      "source": [
        "import torch\n",
        "import torch.nn as nn\n",
        "from torchvision import datasets\n",
        "from torchvision import transforms\n",
        "import numpy as np\n",
        "import os\n",
        "import torch.nn.functional as F\n",
        "import torch.optim as optim\n",
        "import h5py\n",
        "import sys\n",
        "sys.path.append(BASE_PATH)\n",
        "import pt_util"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "NxV9_vQp5o0n"
      },
      "source": [
        "class CifarNet(nn.Module):\n",
        "    def __init__(self):\n",
        "        super(CifarNet, self).__init__()\n",
        "        self.conv1 = nn.Conv2d(3, 16, 3, stride=2, padding=1)\n",
        "        self.conv2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)\n",
        "        self.conv3 = nn.Conv2d(32, 64, 3, stride=2, padding=1)\n",
        "        self.fc1 = nn.Linear(1024, 10)\n",
        "        self.accuracy = None\n",
        "\n",
        "    def forward(self, x):\n",
        "        x = self.conv1(x)\n",
        "        x = F.relu(x)\n",
        "        x = self.conv2(x)\n",
        "        x = F.relu(x)\n",
        "        x = self.conv3(x)\n",
        "        x = F.relu(x)\n",
        "        x = torch.flatten(x, 1)\n",
        "        x = self.fc1(x)\n",
        "        return x\n",
        "\n",
        "    def loss(self, prediction, label, reduction='mean'):\n",
        "        loss_val = F.cross_entropy(prediction, label.squeeze(), reduction=reduction)\n",
        "        return loss_val\n",
        "\n",
        "    def save_model(self, file_path, num_to_keep=1):\n",
        "        pt_util.save(self, file_path, num_to_keep)\n",
        "        \n",
        "    def save_best_model(self, accuracy, file_path, num_to_keep=1):\n",
        "        if self.accuracy == None or accuracy > self.accuracy:\n",
        "            self.accuracy = accuracy\n",
        "            self.save_model(file_path, num_to_keep)\n",
        "\n",
        "    def load_model(self, file_path):\n",
        "        pt_util.restore(self, file_path)\n",
        "\n",
        "    def load_last_model(self, dir_path):\n",
        "        return pt_util.restore_latest(self, dir_path)\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yiJSkXjiKpDL"
      },
      "source": [
        "This time we are giving you the train and test functions, but feel free to modify them if you want. \n",
        "\n",
        "You may need to return some additional information for the logging portion of this assignment.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pmuzixXrkuYs"
      },
      "source": [
        "import time\n",
        "def train(model, device, train_loader, optimizer, epoch, log_interval):\n",
        "    model.train()\n",
        "    losses = []\n",
        "    for batch_idx, (data, label) in enumerate(train_loader):\n",
        "        data, label = data.to(device), label.to(device)\n",
        "        optimizer.zero_grad()\n",
        "        output = model(data)\n",
        "        loss = model.loss(output, label)\n",
        "        losses.append(loss.item())\n",
        "        loss.backward()\n",
        "        optimizer.step()\n",
        "        if batch_idx % log_interval == 0:\n",
        "            print('{} Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
        "                time.ctime(time.time()),\n",
        "                epoch, batch_idx * len(data), len(train_loader.dataset),\n",
        "                100. * batch_idx / len(train_loader), loss.item()))\n",
        "    return np.mean(losses)\n",
        "\n",
        "def test(model, device, test_loader, log_interval=None):\n",
        "    model.eval()\n",
        "    test_loss = 0\n",
        "    correct = 0\n",
        "\n",
        "    with torch.no_grad():\n",
        "        for batch_idx, (data, label) in enumerate(test_loader):\n",
        "            data, label = data.to(device), label.to(device)\n",
        "            output = model(data)\n",
        "            test_loss_on = model.loss(output, label, reduction='sum').item()\n",
        "            test_loss += test_loss_on\n",
        "            pred = output.max(1)[1]\n",
        "            correct_mask = pred.eq(label.view_as(pred))\n",
        "            num_correct = correct_mask.sum().item()\n",
        "            correct += num_correct\n",
        "            if log_interval is not None and batch_idx % log_interval == 0:\n",
        "                print('{} Test: [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
        "                    time.ctime(time.time()),\n",
        "                    batch_idx * len(data), len(test_loader.dataset),\n",
        "                    100. * batch_idx / len(test_loader), test_loss_on))\n",
        "\n",
        "    test_loss /= len(test_loader.dataset)\n",
        "    test_accuracy = 100. * correct / len(test_loader.dataset)\n",
        "\n",
        "    print('\\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\\n'.format(\n",
        "        test_loss, correct, len(test_loader.dataset), test_accuracy))\n",
        "    return test_loss, test_accuracy"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pKSGqIVD5o0r"
      },
      "source": [
        "## Part 3 and 4: Loading Data and Dataset Augmentation\n",
        "\n",
        "In the MNIST assignment, we didn't do any data augmentation because MNIST is kind of easy.\n",
        "\n",
        "In this assignment, you may find that data augmentation helps you a lot (or possibly hurts your performance).\n",
        "\n",
        "You can find a bunch preimplemented here https://pytorch.org/vision/stable/transforms.html and you can also do your own as seen in the tutorial from part 3.\n",
        "\n",
        "Play around with various data augmentations we will suggest some.\n",
        "\n",
        "- ToPILImage - This one is useful for a lot of the built in transforms which expect PIL images. \n",
        "- RandomHorizontalFlip\n",
        "- RandomResizedCrop\n",
        "- ColorJitter\n",
        "- RandomRotation\n",
        "- Normalize\n",
        "- Adding various types of noise\n",
        "- ToTensor - PyTorch expects the output from the dataset to be a tensor in CxHxW format.\n",
        "\n",
        "\n",
        "Note: You should be careful about which of these you apply to the test data. You usually don't want to apply noise to the test data, but you do want to normalize it in the same way for example."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ZTn-I8775o0s"
      },
      "source": [
        "transform_train = transforms.Compose([\n",
        "    transforms.RandomCrop(32, padding=4),\n",
        "    transforms.RandomHorizontalFlip(),\n",
        "    transforms.ToTensor(),\n",
        "])\n",
        "\n",
        "transform_test = transforms.Compose([\n",
        "    transforms.ToTensor(),\n",
        "])\n",
        "\n",
        "data_train = datasets.CIFAR10(root=DATA_PATH, train=True, download=True, transform=transform_train)\n",
        "data_test = datasets.CIFAR10(root=DATA_PATH, train=False, download=True, transform=transform_test)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DYLMN3lf5o0t"
      },
      "source": [
        "## Part 5: Training the network\n",
        "Generally, it is useful to see how your training is going. Often people print their loss to make sure it goes down and their accuracy to make sure it goes up. But pictures are better than words. So for this part, you should record and plot the training loss, test loss, and test accuracy (and whatever else you want). \n",
        "\n",
        "We have created a very simple logging interface which essentially just saves and restores files via pickle in pt_util. Saving and restoring log data is important if you end your run early and want to continue where you left off rather than starting over.\n",
        "\n",
        "We have also provided a plot function which can plot a single line graph. You can use it and plot each value independently, or change it to plot them all in one graph. \n",
        "\n",
        "Try different network architectures and experiment with hyperparameters. You'll answer the questions at the bottom of the file based on these experiments.\n",
        "\n",
        "\n",
        "__Important note: Do not forget to title your graphs and label your axes. Plots are meaningless without a way to read them.__\n",
        "\n",
        "Second Note: It will be helpful for you when deciding what network structure, data augmentation, and such work to title the graphs accordingly so you remember.\n",
        "\n",
        "Third Note: The default setup right now saves and restores the network weights from a single folder. When you modify network architectures, you may want to increment your experiment version number so you start over with your training and log files."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Hj-JBTfwk-4A"
      },
      "source": [
        "# Play around with these constants, you may find a better setting.\n",
        "BATCH_SIZE = 256\n",
        "TEST_BATCH_SIZE = 10\n",
        "EPOCHS = 20\n",
        "LEARNING_RATE = 0.01\n",
        "MOMENTUM = 0.9\n",
        "USE_CUDA = True\n",
        "SEED = 0\n",
        "PRINT_INTERVAL = 100\n",
        "WEIGHT_DECAY = 0.0005\n",
        "\n",
        "EXPERIMENT_VERSION = \"0.1\" # increment this to start a new experiment\n",
        "LOG_PATH = DATA_PATH + 'logs/' + EXPERIMENT_VERSION + '/'\n",
        "\n",
        "# Now the actual training code\n",
        "use_cuda = USE_CUDA and torch.cuda.is_available()\n",
        "\n",
        "#torch.manual_seed(SEED)\n",
        "\n",
        "device = torch.device(\"cuda\" if use_cuda else \"cpu\")\n",
        "print('Using device', device)\n",
        "import multiprocessing\n",
        "print('num cpus:', multiprocessing.cpu_count())\n",
        "\n",
        "kwargs = {'num_workers': multiprocessing.cpu_count(),\n",
        "          'pin_memory': True} if use_cuda else {}\n",
        "\n",
        "class_names = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
        "\n",
        "train_loader = torch.utils.data.DataLoader(data_train, batch_size=BATCH_SIZE,\n",
        "                                           shuffle=True, **kwargs)\n",
        "test_loader = torch.utils.data.DataLoader(data_test, batch_size=TEST_BATCH_SIZE,\n",
        "                                          shuffle=False, **kwargs)\n",
        "\n",
        "model = CifarNet().to(device)\n",
        "optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)\n",
        "start_epoch = model.load_last_model(LOG_PATH)\n",
        "\n",
        "train_losses, test_losses, test_accuracies = pt_util.read_log(LOG_PATH + 'log.pkl', ([], [], []))\n",
        "test_loss, test_accuracy = test(model, device, test_loader)\n",
        "\n",
        "test_losses.append((start_epoch, test_loss))\n",
        "test_accuracies.append((start_epoch, test_accuracy))\n",
        "\n",
        "try:\n",
        "    for epoch in range(start_epoch, EPOCHS + 1):\n",
        "        train_loss = train(model, device, train_loader, optimizer, epoch, PRINT_INTERVAL)\n",
        "        test_loss, test_accuracy = test(model, device, test_loader)\n",
        "        train_losses.append((epoch, train_loss))\n",
        "        test_losses.append((epoch, test_loss))\n",
        "        test_accuracies.append((epoch, test_accuracy))\n",
        "        pt_util.write_log(LOG_PATH + 'log.pkl', (train_losses, test_losses, test_accuracies))\n",
        "        model.save_best_model(test_accuracy, LOG_PATH + '%03d.pt' % epoch)\n",
        "\n",
        "\n",
        "except KeyboardInterrupt as ke:\n",
        "    print('Interrupted')\n",
        "except:\n",
        "    import traceback\n",
        "    traceback.print_exc()\n",
        "finally:\n",
        "    model.save_model(LOG_PATH + '%03d.pt' % epoch, 0)\n",
        "    ep, val = zip(*train_losses)\n",
        "    pt_util.plot(ep, val, 'Train loss', 'Epoch', 'Error')\n",
        "    ep, val = zip(*test_losses)\n",
        "    pt_util.plot(ep, val, 'Test loss', 'Epoch', 'Error')\n",
        "    ep, val = zip(*test_accuracies)\n",
        "    pt_util.plot(ep, val, 'Test accuracy', 'Epoch', 'Error')\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-BoS0b8T6o8L"
      },
      "source": [
        "##CIFAR Questions\n",
        "\n",
        "1. What design that you tried worked the best? This includes things like network design, learning rate, batch size, number of epochs, and other optimization parameters, data augmentation etc. What was the final train loss? Test loss? Test Accuracy? Provide the plots for train loss, test loss, and test accuracy.\n",
        "\n",
        "TODO: Answer here\n",
        "\n",
        "2. What design worked the worst (but still performed better than random chance)? Provide all the same information as question 1.\n",
        "\n",
        "TODO: Answer here\n",
        "\n",
        "3. Why do you think the best one worked well and the worst one worked poorly.\n",
        "\n",
        "TODO: Answer here\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N_E8ys4b5o0w"
      },
      "source": [
        "---\n",
        "# TinyImageNet"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BjUlLqslACEW"
      },
      "source": [
        "## Part 1: Upload the Dataset\n",
        "Change the DATA_PATH to the path of the TinyImageNet dataset we downloaded earlier."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2UtsxBCpChPn"
      },
      "source": [
        "DATA_PATH = BASE_PATH + 'tiny_imagenet/'"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "b8NWTxZvJeAE"
      },
      "source": [
        "## Part 2: Defining the Network\n",
        "We're giving you no instructions on this part. Welcome to deep learning research! See if you can get above 40% accuracy. You probably want to use the Cross Entropy error again, but who knows, maybe you can find a better loss function. We will give you a few hints of things to try:\n",
        "\n",
        "- Maxpooling\n",
        "- Activation functions other than ReLU\n",
        "- Batch Norm\n",
        "- Dropout\n",
        "- Residual connections\n",
        "\n",
        "To define your network you'll have to figure out more about the Tiny ImageNet dataset. Specifically, what size are the images you'll be processing? How big is your label space? You can find this out by examining samples of your data."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_9apqhaW5o0z"
      },
      "source": [
        "import torch\n",
        "import torch.nn as nn\n",
        "from torchvision import datasets\n",
        "from torchvision import transforms\n",
        "import numpy as np\n",
        "import os\n",
        "import torch.nn.functional as F\n",
        "import torch.optim as optim\n",
        "import h5py\n",
        "import sys\n",
        "sys.path.append(BASE_PATH)\n",
        "import pt_util"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "RA6lPT8Ceubk"
      },
      "source": [
        "class TinyImagenetNet(nn.Module):\n",
        "    def __init__(self):\n",
        "        super(TinyImagenetNet, self).__init__()\n",
        "        # TODO define the layers\n",
        "        raise NotImplementedError('Need to define the layers for your network')\n",
        "\n",
        "    def forward(self, x):\n",
        "        # TODO define the forward pass\n",
        "        raise NotImplementedError('Need to define the forward pass')\n",
        "        return x\n",
        "\n",
        "    def loss(self, prediction, label, reduction='mean'):\n",
        "        loss_val = F.cross_entropy(prediction, label.squeeze(), reduction=reduction)\n",
        "        return loss_val\n",
        "\n",
        "    def save_model(self, file_path, num_to_keep=1):\n",
        "        pt_util.save(self, file_path, num_to_keep)\n",
        "        \n",
        "    def save_best_model(self, accuracy, file_path, num_to_keep=1):\n",
        "        if self.accuracy == None or accuracy > self.accuracy:\n",
        "            self.accuracy = accuracy\n",
        "            self.save_model(file_path, num_to_keep)\n",
        "\n",
        "    def load_model(self, file_path):\n",
        "        pt_util.restore(self, file_path)\n",
        "\n",
        "    def load_last_model(self, dir_path):\n",
        "        return pt_util.restore_latest(self, dir_path)\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kRLQ6rgz5o02"
      },
      "source": [
        "This time we are giving you the train and test functions, but feel free to modify them if you want. \n",
        "\n",
        "You may need to return some additional information for the logging portion of this assignment.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8wMEn1Xk5o02"
      },
      "source": [
        "import time\n",
        "def train(model, device, train_loader, optimizer, epoch, log_interval):\n",
        "    model.train()\n",
        "    losses = []\n",
        "    for batch_idx, (data, label) in enumerate(train_loader):\n",
        "        data, label = data.to(device), label.to(device)\n",
        "        optimizer.zero_grad()\n",
        "        output = model(data)\n",
        "        loss = model.loss(output, label)\n",
        "        losses.append(loss.item())\n",
        "        loss.backward()\n",
        "        optimizer.step()\n",
        "        if batch_idx % log_interval == 0:\n",
        "            print('{} Train Epoch: {} [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
        "                time.ctime(time.time()),\n",
        "                epoch, batch_idx * len(data), len(train_loader.dataset),\n",
        "                100. * batch_idx / len(train_loader), loss.item()))\n",
        "    return np.mean(losses)\n",
        "\n",
        "def test(model, device, test_loader, return_images=False, log_interval=None):\n",
        "    model.eval()\n",
        "    test_loss = 0\n",
        "    correct = 0\n",
        "\n",
        "    correct_images = []\n",
        "    correct_values = []\n",
        "\n",
        "    error_images = []\n",
        "    predicted_values = []\n",
        "    gt_values = []\n",
        "    with torch.no_grad():\n",
        "        for batch_idx, (data, label) in enumerate(test_loader):\n",
        "            data, label = data.to(device), label.to(device)\n",
        "            output = model(data)\n",
        "            test_loss_on = model.loss(output, label, reduction='sum').item()\n",
        "            test_loss += test_loss_on\n",
        "            pred = output.max(1)[1]\n",
        "            correct_mask = pred.eq(label.view_as(pred))\n",
        "            num_correct = correct_mask.sum().item()\n",
        "            correct += num_correct\n",
        "            if return_images:\n",
        "                if num_correct > 0:\n",
        "                    correct_images.append(data[correct_mask, ...].data.cpu().numpy())\n",
        "                    correct_value_data = label[correct_mask].data.cpu().numpy()[:, 0]\n",
        "                    correct_values.append(correct_value_data)\n",
        "                if num_correct < len(label):\n",
        "                    error_data = data[~correct_mask, ...].data.cpu().numpy()\n",
        "                    error_images.append(error_data)\n",
        "                    predicted_value_data = pred[~correct_mask].data.cpu().numpy()\n",
        "                    predicted_values.append(predicted_value_data)\n",
        "                    gt_value_data = label[~correct_mask].data.cpu().numpy()[:, 0]\n",
        "                    gt_values.append(gt_value_data)\n",
        "            if log_interval is not None and batch_idx % log_interval == 0:\n",
        "                print('{} Test: [{}/{} ({:.0f}%)]\\tLoss: {:.6f}'.format(\n",
        "                    time.ctime(time.time()),\n",
        "                    batch_idx * len(data), len(test_loader.dataset),\n",
        "                    100. * batch_idx / len(test_loader), test_loss_on))\n",
        "    if return_images:\n",
        "        correct_images = np.concatenate(correct_images, axis=0)\n",
        "        error_images = np.concatenate(error_images, axis=0)\n",
        "        predicted_values = np.concatenate(predicted_values, axis=0)\n",
        "        correct_values = np.concatenate(correct_values, axis=0)\n",
        "        gt_values = np.concatenate(gt_values, axis=0)\n",
        "\n",
        "    test_loss /= len(test_loader.dataset)\n",
        "    test_accuracy = 100. * correct / len(test_loader.dataset)\n",
        "\n",
        "    print('\\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\\n'.format(\n",
        "        test_loss, correct, len(test_loader.dataset), test_accuracy))\n",
        "    if return_images:\n",
        "        return test_loss, test_accuracy, correct_images, correct_values, error_images, predicted_values, gt_values\n",
        "    else:\n",
        "        return test_loss, test_accuracy\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EwMDBwoCDRS_"
      },
      "source": [
        "## Part 3: Loading Data\n",
        "PyTorch has a nice interface for dealing with a variety of data. You can read a good tutorial here https://pytorch.org/tutorials/beginner/data_loading_tutorial.html\n",
        "Your friendly neighborhood TAs have made it even easier by preprocessing the data into a nice format. The data you uploaded is stored using hdf5 files which can be accessed a lot like Numpy arrays using the h5py package. In each of the files, there is a \"dataset\" called 'images', and one called 'labels'. Read more about h5py here http://docs.h5py.org/en/latest/quick.html\n",
        "\n",
        "Hints:\n",
        "1. HDF5s don't support concurrent accesses.\n",
        "2. If you don't close the HDF5 file, you will still have problems with concurrency.\n",
        "3. One way to deal with concurrent accesses is to copy the entirety of the data into each process separately. Then each process accesses its own copy of the data. https://stackoverflow.com/questions/40449659/does-h5py-read-the-whole-file-into-memory\n",
        "4. Speed hint: With small datasets, it is almost always a good idea to cache the data to disk rather than continually read from files.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "igPTIlBtk2vu"
      },
      "source": [
        "# Data loader\n",
        "class H5Dataset(torch.utils.data.Dataset):\n",
        "    def __init__(self, h5_file, transform=None):\n",
        "        self.transform = transform\n",
        "        self.h5_file = h5py.File(h5_file, 'r')\n",
        "        self.images = self.h5_file['images'][:]\n",
        "        self.labels = torch.LongTensor(self.h5_file['labels'][:])\n",
        "        \n",
        "    def __len__(self):\n",
        "        return self.labels.shape[0]\n",
        "      \n",
        "    def __getitem__(self, idx):\n",
        "        data = self.images[idx]\n",
        "        label = self.labels[idx]\n",
        "        \n",
        "        if self.transform:\n",
        "            data = self.transform(data)\n",
        "        return (data, label)\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tfWvYzBaKcO7"
      },
      "source": [
        "## Part 4: Dataset Augmentation\n",
        "In the MNIST assignment, we didn't do any data augmentation because MNIST is kind of easy.\n",
        "\n",
        "In this assignment, you may find that data augmentation helps you a lot (or possibly hurts your performance).\n",
        "\n",
        "You can find a bunch preimplemented here https://pytorch.org/docs/stable/torchvision/transforms.html and you can also do your own as seen in the tutorial from part 3.\n",
        "\n",
        "Play around with various data augmentations we will suggest some.\n",
        "\n",
        "- ToPILImage - This one is useful for a lot of the built in transforms which expect PIL images. \n",
        "- RandomHorizontalFlip\n",
        "- RandomResizedCrop\n",
        "- ColorJitter\n",
        "- RandomRotation\n",
        "- Normalize\n",
        "- Adding various types of noise\n",
        "- ToTensor - PyTorch expects the output from the dataset to be a tensor in CxHxW format.\n",
        "\n",
        "\n",
        "Note: You should be careful about which of these you apply to the test data. You usually don't want to apply noise to the test data, but you do want to normalize it in the same way for example.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "-5JeXSx9LIx3"
      },
      "source": [
        "train_transforms = transforms.Compose([\n",
        "    transforms.ToPILImage(),\n",
        "    transforms.RandomHorizontalFlip(),\n",
        "    transforms.ToTensor(),\n",
        "    ])\n",
        "\n",
        "test_transforms = transforms.Compose([\n",
        "    transforms.ToTensor(),\n",
        "    ])\n",
        "\n",
        "data_train = H5Dataset(DATA_PATH + 'train.h5', transform=train_transforms)\n",
        "print(len(data_train))\n",
        "data_test = H5Dataset(DATA_PATH + 'val.h5', transform=test_transforms)\n",
        "print(len(data_test))\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "piz_PoP-N5mK"
      },
      "source": [
        "## Part 5: Training the network\n",
        "Generally, it is useful to see how your training is going. Often people print their loss to make sure it goes down and their accuracy to make sure it goes up. But pictures are better than words. So for this part, you should record and plot the training loss, test loss, and test accuracy (and whatever else you want). \n",
        "\n",
        "We have created a very simple logging interface which essentially just saves and restores files via pickle in pt_util. Saving and restoring log data is important if you end your run early and want to continue where you left off rather than starting over.\n",
        "\n",
        "We have also provided a plot function which can plot a single line graph. You can use it and plot each value independently, or change it to plot them all in one graph. \n",
        "\n",
        "\n",
        "__Important note: Do not forget to title your graphs and label your axes. Plots are meaningless without a way to read them.__\n",
        "\n",
        "Second Note: It will be helpful for you when deciding what network structure, data augmentation, and such work to title the graphs accordingly so you remember.\n",
        "Third Note: The default setup right now saves and restores the network weights from a single folder. When you modify network architectures, you may want to save the resulting files in different folders (with appropriate names).\n",
        "\n",
        "We also provided a function for showing some results, because it's not satisfying to train a neural net, you also want to see what it can do! This can also be useful for figuring out what your network is doing well, and what it is failing at. This type of error analysis is very common when training neural networks.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "5Jd_asDR5o07"
      },
      "source": [
        "# Play around with these constants, you may find a better setting.\n",
        "BATCH_SIZE = 256\n",
        "TEST_BATCH_SIZE = 10\n",
        "EPOCHS = 20\n",
        "LEARNING_RATE = 0.01\n",
        "MOMENTUM = 0.9\n",
        "USE_CUDA = True\n",
        "SEED = 0\n",
        "PRINT_INTERVAL = 100\n",
        "WEIGHT_DECAY = 0.0005\n",
        "\n",
        "EXPERIMENT_VERSION = \"0.1\" # increment this to start a new experiment\n",
        "LOG_PATH = DATA_PATH + 'logs/' + EXPERIMENT_VERSION + '/'\n",
        "\n",
        "# Now the actual training code\n",
        "use_cuda = USE_CUDA and torch.cuda.is_available()\n",
        "\n",
        "#torch.manual_seed(SEED)\n",
        "\n",
        "device = torch.device(\"cuda\" if use_cuda else \"cpu\")\n",
        "print('Using device', device)\n",
        "import multiprocessing\n",
        "print('num cpus:', multiprocessing.cpu_count())\n",
        "\n",
        "kwargs = {'num_workers': multiprocessing.cpu_count(),\n",
        "          'pin_memory': True} if use_cuda else {}\n",
        "\n",
        "class_names = [line.strip().split(', ') for line in open(DATA_PATH + 'class_names.txt')]\n",
        "name_to_class = {line[1]: line[0] for line in class_names}\n",
        "class_names = [line[1] for line in class_names]\n",
        "\n",
        "train_loader = torch.utils.data.DataLoader(data_train, batch_size=BATCH_SIZE,\n",
        "                                           shuffle=True, **kwargs)\n",
        "test_loader = torch.utils.data.DataLoader(data_test, batch_size=TEST_BATCH_SIZE,\n",
        "                                          shuffle=False, **kwargs)\n",
        "\n",
        "model = TinyImagenetNet().to(device)\n",
        "optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)\n",
        "start_epoch = model.load_last_model(LOG_PATH)\n",
        "\n",
        "train_losses, test_losses, test_accuracies = pt_util.read_log(LOG_PATH + 'log.pkl', ([], [], []))\n",
        "test_loss, test_accuracy, correct_images, correct_val, error_images, predicted_val, gt_val = test(model, device, test_loader, True)\n",
        "\n",
        "correct_images = pt_util.to_scaled_uint8(correct_images.transpose(0, 2, 3, 1))\n",
        "error_images = pt_util.to_scaled_uint8(error_images.transpose(0, 2, 3, 1))\n",
        "pt_util.show_images(correct_images, ['correct: %s' % class_names[aa] for aa in correct_val])\n",
        "pt_util.show_images(error_images, ['pred: %s, actual: %s' % (class_names[aa], class_names[bb]) for aa, bb in zip(predicted_val, gt_val)])\n",
        "\n",
        "test_losses.append((start_epoch, test_loss))\n",
        "test_accuracies.append((start_epoch, test_accuracy))\n",
        "\n",
        "try:\n",
        "    for epoch in range(start_epoch, EPOCHS + 1):\n",
        "        train_loss = train(model, device, train_loader, optimizer, epoch, PRINT_INTERVAL)\n",
        "        test_loss, test_accuracy, correct_images, correct_val, error_images, predicted_val, gt_val = test(model, device, test_loader, True)\n",
        "        train_losses.append((epoch, train_loss))\n",
        "        test_losses.append((epoch, test_loss))\n",
        "        test_accuracies.append((epoch, test_accuracy))\n",
        "        pt_util.write_log(LOG_PATH + '.pkl', (train_losses, test_losses, test_accuracies))\n",
        "        model.save_best_model(test_accuracy, LOG_PATH + '%03d.pt' % epoch)\n",
        "\n",
        "\n",
        "except KeyboardInterrupt as ke:\n",
        "    print('Interrupted')\n",
        "except:\n",
        "    import traceback\n",
        "    traceback.print_exc()\n",
        "finally:\n",
        "    model.save_model(LOG_PATH + '%03d.pt' % epoch, 0)\n",
        "    ep, val = zip(*train_losses)\n",
        "    pt_util.plot(ep, val, 'Train loss', 'Epoch', 'Error')\n",
        "    ep, val = zip(*test_losses)\n",
        "    pt_util.plot(ep, val, 'Test loss', 'Epoch', 'Error')\n",
        "    ep, val = zip(*test_accuracies)\n",
        "    pt_util.plot(ep, val, 'Test accuracy', 'Epoch', 'Error')\n",
        "    correct_images = pt_util.to_scaled_uint8(correct_images.transpose(0, 2, 3, 1))\n",
        "    error_images = pt_util.to_scaled_uint8(error_images.transpose(0, 2, 3, 1))\n",
        "    pt_util.show_images(correct_images, ['correct: %s' % class_names[aa] for aa in correct_val])\n",
        "    pt_util.show_images(error_images, ['pred: %s, actual: %s' % (class_names[aa], class_names[bb]) for aa, bb in zip(predicted_val, gt_val)])\n"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WR2hmzE97Bsg"
      },
      "source": [
        "## TinyImageNet Questions\n",
        "\n",
        "1. What design that you tried worked the best? How many epochs were you able to run it for? Provide the same information from CIFAR question 1.\n",
        "\n",
        "TODO: Answer here\n",
        "\n",
        "2. Were you able to use larger/deeper networks on TinyImageNet than you used on CIFAR and increase accuracy? If so, why? If not, why not?\n",
        "\n",
        "TODO: Answer here\n",
        "\n",
        "3. The real ImageNet dataset has significantly larger images. How would you change your network design if the images were twice as large? How about smaller than Tiny ImageNet? How do you think your accuracy would change? This is open-ended, but we want a more thought-out answer than \"I'd resize the images\" or \"I'd do a larger pooling stride.\" You don't have to write code to test your hypothesis.\n",
        "\n",
        "TODO: Answer here"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uUzPdeQLMGQh"
      },
      "source": [
        "# Turn-in\n",
        "\n",
        "Download your `hw1.ipynb` and put it in your `uwnet` repository."
      ]
    }
  ]
}
